The Structure of Factor Oracles
Abstract
The factor oracle is a relatively new data structure for the set of factors of a string. It was introduced by Allauzen, Crochemore, and Raffinot in 1999. It may recognize nonfactors (hence the name “oracle”), but its implementational simplicity and experimental behaviour are stunning; string matching based on the factor oracle has been conjectured to be optimal on average. However, its structure is not well understood. We take important steps in clarifying its structure by explaining how it can be obtained as a quotient of the trie of the set of factors. When seen this way, all known properties of the factor oracle become simple observations. We also introduce a framework in which various oracles can be compared. The factor oracle is better than several natural oracles obtained from the trie of the set of factors, the suffix automaton, and the factor automaton, respectively.
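To make the "may recognize nonfactors" remark concrete, here is a minimal Python sketch of the standard online construction commonly attributed to Allauzen, Crochemore, and Raffinot. It is not the quotient-of-trie view developed in this paper. The function names `build_factor_oracle` and `accepts`, the dictionary-based transition table, and the sample string `abbbaab` are choices made for this illustration only.

```python
def build_factor_oracle(p):
    """Build a factor oracle for string p with the online construction.

    Returns a list `trans` where trans[s] maps a character to the target
    state. States are numbered 0..len(p); every state is final.
    """
    trans = [{}]      # transitions of state 0
    supply = [-1]     # supply (failure) link of state 0 is undefined
    for i, c in enumerate(p, start=1):
        trans.append({})
        trans[i - 1][c] = i              # "spine" transition i-1 -> i
        k = supply[i - 1]
        # Follow supply links, adding shortcut transitions labelled c.
        while k > -1 and c not in trans[k]:
            trans[k][c] = i
            k = supply[k]
        supply.append(0 if k == -1 else trans[k][c])
    return trans


def accepts(trans, w):
    """Return True if w labels a path from state 0 (all states are final)."""
    state = 0
    for c in w:
        if c not in trans[state]:
            return False
        state = trans[state][c]
    return True


if __name__ == "__main__":
    text = "abbbaab"
    oracle = build_factor_oracle(text)
    print(accepts(oracle, "bbaa"))   # True: "bbaa" is a factor of the text
    print(accepts(oracle, "abba"))   # True, although "abba" is not a factor
    print("abba" in text)            # False
    print(accepts(oracle, "aaa"))    # False: correctly rejected
```

In this sketch every factor of `abbbaab` is accepted, and `abba` is one nonfactor that is accepted as well, which is the kind of behaviour the word "oracle" refers to.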
Similar resources
Error analysis of factor oracles
Factor oracles [1] constructed from a given text are deterministic acyclic automata accepting all substrings of the text. Factor oracles are more space-economical and easier to implement than similar data structures such as suffix trees [6]. There is, however, a drawback: a factor oracle may accept strings that do not occur in the text, which we call an error acceptance. In this paper, we characterize factor or...
Constructing Factor Oracles
A factor oracle is a data structure for weak factor recognition. It is an automaton built on a string p of length m that is acyclic, recognizes at least all factors of p, has m+1 states which are all final, and has m to 2m−1 transitions. In this paper, we give two alternative algorithms for its construction and prove the constructed automata to be equivalent to the automata constructed by the a...
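The size properties listed above can be checked empirically. The following Python sketch assumes the standard online construction (not either of the two alternative algorithms this paper proposes); the helper name `oracle_stats` and the sample strings are hypothetical and chosen only for illustration.

```python
def oracle_stats(p):
    """Build a factor oracle for p online and report its size.

    Returns (number of states, number of transitions, acyclic flag).
    The acyclic flag is True when every transition leads to a
    higher-numbered state, which rules out cycles.
    """
    trans = [{}]
    supply = [-1]
    for i, c in enumerate(p, start=1):
        trans.append({})
        trans[i - 1][c] = i
        k = supply[i - 1]
        while k > -1 and c not in trans[k]:
            trans[k][c] = i
            k = supply[k]
        supply.append(0 if k == -1 else trans[k][c])

    n_transitions = sum(len(t) for t in trans)
    acyclic = all(dst > src
                  for src, t in enumerate(trans)
                  for dst in t.values())
    return len(trans), n_transitions, acyclic


if __name__ == "__main__":
    for p in ["aaaa", "abbbaab", "abcd"]:
        m = len(p)
        states, transitions, acyclic = oracle_stats(p)
        # Expect m + 1 states and between m and 2m - 1 transitions.
        print(p, states == m + 1, m <= transitions <= 2 * m - 1, acyclic)
```

On these samples, `aaaa` sits at the lower bound of m transitions and `abcd` at the upper bound of 2m−1, while `abbbaab` falls in between.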
Finding Maximal Repeats with Factor Oracles
Factor oracles, built from an input text, are automata similar to suffix automata that accept at least all substrings of the input text. In papers [LL00] and [LLA02], factor oracles are used to detect repeats in text. Although the repeats found with these methods are not maximal, the average error is very low and the algorithm runs quite fast. In this paper, we present two ideas to improve accuracy of t...
Weak Factor Automata: Comparing (Failure) Oracles and Storacles
The factor oracle [3] is a data structure for weak factor recognition. It is a deterministic finite automaton (DFA) built on a string p of length m that is acyclic, recognizes at least all factors of p, has m+1 states which are all final, is homogeneous, and has m to 2m − 1 transitions. The factor storacle [6] is an alternative automaton that satisfies the same properties, except that its numbe...
Using Factor Oracles for Machine Improvisation
We describe variable Markov models we have used for statistical learning of musical sequences, then we present the factor oracle, a data structure proposed by Crochemore et al. for string matching. We show the relation between this structure and the previous models and indicate how it can be adapted for learning musical sequences and generating improvisations in a real-time context.
Journal:
- Int. J. Found. Comput. Sci.
Volume 18
Publication date 2007